What are factor models?

By Evgenia "Jenny" Nitishinskaya and Delaney Granizo-Mackenzie

Notebook released under the Creative Commons Attribution 4.0 License.


(Linear) factor models try to express a rate of return of an asset $i$ in terms of some factors as

$$R_i = a_i + b_{i1} F_1 + b_{i2} F_2 + \ldots + b_{iK} F_K + \epsilon_i$$

where the variables $F_j$ are called factors and the coefficients $b_{ij}$ are called factor sensitivities (or loadings). The error term $\epsilon_i$ has mean zero and represents asset-specific random fluctuation, which can be diversified away by combining many different assets into a portfolio. Notice that the factors are not indexed by $i$: factor models are intended to apply to many different return streams and are not asset-specific. We can therefore use one model to evaluate or predict the returns on different assets and portfolios.
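
To see why the asset-specific term shrinks in a portfolio, consider a sketch under two simplifying assumptions that are introduced here for illustration: the error terms are uncorrelated across assets, and they share a common variance $\sigma^2$. For an equal-weighted portfolio of $n$ assets, the combined error term then has variance

$$\mathrm{Var}\left(\sum_{i=1}^n \frac{1}{n} \epsilon_i\right) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}(\epsilon_i) = \frac{\sigma^2}{n}$$

which goes to zero as $n$ grows. The factor terms are common to all assets, so they do not average out in the same way.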

What are the factors that we use in the model? One possibility is the return on the market: a model with only this one factor is what we use for beta hedging. Other potential factors include inflation, changes in industrial production, and an asset's price-to-book ratio. We discuss the different kinds of factors in more detail below, but the overarching principle is that, as with the regression models we have considered in the past, we should have an economic reason to believe that the factors we choose might explain returns. We also want factors that explain the returns of many assets.

To get the model for a portfolio, we simply take the weighted average of the models for the constituent assets. If our portfolio is $P = \sum_{i=1}^n w_i x_i$, where $w_i$ is the weight on asset $x_i$, then the portfolio return $R_p$ follows the model

$$ R_p = \sum_{i=1}^n w_i R_i = \sum_{i=1}^n w_i a_i + \left(\sum_{i=1}^n w_i b_{i1} \right) F_1 + \left(\sum_{i=1}^n w_i b_{i2} \right) F_2 + \ldots + \left(\sum_{i=1}^n w_i b_{iK} \right) F_K + \sum_{i=1}^n w_i \epsilon_i $$
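
The weighted-average formula above can be sketched in a few lines of NumPy. The weights, intercepts, loadings, and factor values below are made-up numbers for a hypothetical three-asset, two-factor example, and the error terms are left out to keep the check exact.

```python
import numpy as np

# Hypothetical inputs: 3 assets, 2 factors (all numbers are illustrative).
weights = np.array([0.5, 0.3, 0.2])        # w_i, summing to 1
alphas = np.array([0.010, 0.020, 0.015])   # a_i
loadings = np.array([[1.0, 0.2],           # b_ij: row i = asset, column j = factor
                     [0.8, -0.1],
                     [1.2, 0.5]])

# Portfolio intercept and portfolio factor sensitivities are weighted averages.
portfolio_alpha = weights @ alphas
portfolio_loadings = weights @ loadings

# Consistency check for one made-up factor realization (error terms omitted):
# weighting the asset returns gives the same number as the portfolio model.
factors = np.array([0.03, -0.01])
asset_returns = alphas + loadings @ factors
portfolio_return = portfolio_alpha + portfolio_loadings @ factors

print(portfolio_alpha, portfolio_loadings, portfolio_return)
```

Because the model is linear in the factors, the portfolio's sensitivity to each factor is just the weighted sum of the assets' sensitivities to it.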

For factor analysis to be valid, we need the error term $\epsilon_i$ to be uncorrelated across assets $i$ (so that it does in fact represent asset-specific risk), and uncorrelated with the factors $F_1, \ldots, F_K$ (if it is correlated with some factor $F_j$, then we have chosen $b_{ij}$ incorrectly, which has forced the error term to compensate).
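
One way to probe the first condition is to look at the correlation of fitted residuals across assets. The sketch below uses simulated data, with hypothetical sizes and parameters, and fits each asset by ordinary least squares. Note that least-squares residuals are uncorrelated in-sample with the included factors by construction, so the cross-asset correlations are the informative diagnostic here.

```python
import numpy as np

rng = np.random.default_rng(42)
T, N, K = 1000, 4, 2   # periods, assets, factors (all hypothetical)

# Simulated data that satisfies the assumptions: independent factors and errors.
F = rng.normal(0.0, 1.0, size=(T, K))      # factor values
B = rng.normal(0.0, 1.0, size=(N, K))      # true sensitivities b_ij
eps = rng.normal(0.0, 0.02, size=(T, N))   # asset-specific errors
R = 0.01 + F @ B.T + eps                   # returns, one column per asset

# Fit all N regressions at once: an intercept column plus the K factors.
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # shape (K + 1, N)
resid = R - X @ coef

# Largest absolute residual correlation between two different assets.
resid_corr = np.corrcoef(resid, rowvar=False)
off_diag = resid_corr[~np.eye(N, dtype=bool)]
max_cross_corr = np.abs(off_diag).max()

# In-sample covariance between factors and residuals (zero up to rounding).
factor_resid_cov = F.T @ resid / T

print(max_cross_corr)
```

On real data, large off-diagonal residual correlations would suggest that some common driver is missing from the chosen factors.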

Types of factor models

Factor models can be categorized into three main groups, depending on the types of factors they use. There are also mixed factor models, which combine factors from different groups.

Macroeconomic factor models

In this model, the variable $a_i$ is the expected return, and the factors are surprises (actual value minus expected value) in macroeconomic variables. That is, we have some prediction $a_i$ for the return which depends on our predictions of variables like inflation, and surprises in these variables will cause the actual returns to deviate from the expectation. So, the actual return is broken down into the expected return, unexpected return resulting from unexpected changes in the factors, and an error term. If we take the expected value of both sides, we get $a_i$, since by definition $F_j$ and $\epsilon_i$ have expected value 0:

\begin{align}
E(R_i) &= E(a_i + b_{i1} F_1 + b_{i2} F_2 + \ldots + b_{iK} F_K + \epsilon_i) \\
&= E(a_i) + E(b_{i1} F_1) + E(b_{i2} F_2) + \ldots + E(b_{iK} F_K) + E(\epsilon_i) \\
&= a_i + b_{i1} E(F_1) + b_{i2} E(F_2) + \ldots + b_{iK} E(F_K) + 0 \\
&= a_i + 0 + 0 + \ldots + 0 + 0 \\
&= a_i
\end{align}

We use a regression to calculate the coefficients $b_{ij}$ (one regression for each asset $i$, using historical data). The meanings of these factor sensitivities are the same as in any linear regression: we predict that every unit of surprise in variable $j$, which is an increase of $F_j$ by one unit, will cause an increase of $b_{ij}$ in the returns of stock $i$, assuming that the other factors are held constant.
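
A minimal sketch of this time-series regression for a single asset, on simulated data. All names and parameter values are hypothetical, and ordinary least squares via `numpy.linalg.lstsq` is only one of several ways to run the fit.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 500, 2   # number of periods and factors (hypothetical)

# Simulated factor surprises with mean zero, plus made-up "true" parameters.
F = rng.normal(0.0, 1.0, size=(T, K))
true_a = 0.01                      # expected return a_i
true_b = np.array([0.5, -0.3])     # sensitivities b_i1, b_i2
eps = rng.normal(0.0, 0.02, size=T)
R = true_a + F @ true_b + eps      # simulated returns of asset i

# Regress returns on an intercept column and the factor surprises.
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)
a_hat, b_hat = coef[0], coef[1:]

print(a_hat, b_hat)
```

With 500 observations and little noise, the estimates land close to the simulated values. On real data we would also want standard errors, which a statistics package such as statsmodels reports.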

Fundamental factor models

Fundamentals are data having to do with the asset issuer, like the sector, size, and expenses of the company. In the fundamental factor model, we use the same equation as before, but interpret the terms differently. The factors $F_j$ represent the returns associated with some fundamental characteristics.

Statistical factor models

This type of model computes the factors that best describe returns using statistical methods, without specifying what the factors represent. The two main methods for obtaining the factors are principal component analysis (resulting in factors that are the portfolios best explaining historical return variances) and factor analysis (factors are portfolios that best explain historical return covariances). After we compute the portfolios we want to use as factors, we can estimate the loadings with linear regression.
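
The principal component route can be sketched as follows. The returns are simulated with one hidden common driver (a setup assumed here for illustration), the factor is the leading eigenvector of the sample covariance matrix, and the loadings come from a regression of returns on that factor.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 1000, 6   # periods and assets (hypothetical)

# Simulated returns: one hidden common driver plus asset-specific noise.
common = rng.normal(0.0, 0.02, size=T)
betas = np.linspace(0.5, 1.5, N)
returns = np.outer(common, betas) + rng.normal(0.0, 0.005, size=(T, N))

# Principal components are the eigenvectors of the return covariance matrix.
demeaned = returns - returns.mean(axis=0)
cov = np.cov(demeaned, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]             # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 1                                         # number of factors to keep
factor_weights = eigvecs[:, :k]               # portfolio weights defining each factor
factor_returns = demeaned @ factor_weights    # shape (T, k)
explained = eigvals[:k].sum() / eigvals.sum() # share of total variance explained

# Estimate each asset's loadings by regressing its returns on the factors.
X = np.column_stack([np.ones(T), factor_returns])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
loadings = coef[1:].T                         # shape (N, k)

print(explained)
```

The sign of each eigenvector is arbitrary, so a statistical factor may come out as the negative of the driver we have in mind. Only its relationship to returns is pinned down.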

Currently used models

There are many factor models in use in academia and industry today, whose factors were chosen based on extensive research. Among these are the BIRR (Burmeister, Ibbotson, Roll, and Ross) model, which is macroeconomic, and the Fama-French three-factor model, which is a mixed model using market cap, the P/B ratio (through its inverse, book-to-market), and market returns.